An Adaptive Failure Detection Protocol

Authors

  • Christof Fetzer
  • Michel Raynal
  • Frédéric Tronel
Abstract

The detection of process failures is a crucial problem that system designers have to cope with in order to build fault-tolerant distributed platforms. Unfortunately, it is impossible to distinguish with certainty a crashed process from a very slow process in a purely asynchronous distributed system. This prevents some problems from being solved in such systems. That is why failure detector oracles have been introduced to circumvent these impossibility results. This paper presents a relatively simple protocol that allows a process to “monitor” another process, and consequently to detect its crash. The protocol has the nice property of relying as much as possible on application messages to do this monitoring. Unlike previous process crash detection protocols, it uses control messages only when no application message is sent by the monitoring process to the observed process. The protocol has noteworthy features. When the underlying system satisfies the partial synchrony assumption, it actually implements an eventually perfect failure detector (i.e., a failure detector of the class usually denoted ◇P). Moreover, if the average observed transmission delay is finite and the upper layer application terminates within a bounded number of steps for any failure detector in ◇P after the failure detector becomes “perfect”, then, when run with the proposed protocol, it also terminates correctly. These properties make the protocol attractive: it is inexpensive, implementable, and powerful. The paper also describes performance measurements of an implementation of the protocol.
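The abstract does not give the protocol itself, so the following is only a minimal Python sketch of the general idea it describes, under stated assumptions: acknowledgements of application messages double as liveness evidence, a control message is sent only after a period with no outgoing traffic, and the timeout grows whenever a late acknowledgement reveals a wrong suspicion (a common way to approach ◇P behaviour under partial synchrony). The class `Monitor`, its fields, and the `idle_limit` parameter are hypothetical names chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Monitor:
    """Illustrative monitor of a single observed process (not the paper's code)."""
    timeout: float = 1.0      # current estimate of the round-trip bound (assumed)
    idle_limit: float = 0.5   # silence after which a control message is sent (assumed)
    suspected: bool = False
    control_sent: int = 0     # number of control messages used so far
    last_send: float = 0.0
    next_id: int = 0
    pending: dict = field(default_factory=dict)  # message id -> send time

    def send(self, now: float, kind: str = "app") -> int:
        """Record an outgoing message whose acknowledgement is awaited."""
        msg_id = self.next_id
        self.next_id += 1
        self.pending[msg_id] = now
        self.last_send = now
        if kind == "control":
            self.control_sent += 1
        return msg_id

    def on_ack(self, msg_id: int, now: float) -> None:
        """An acknowledgement shows the observed process is alive."""
        sent_at = self.pending.pop(msg_id, None)
        if sent_at is None:
            return  # unknown or duplicate acknowledgement
        round_trip = now - sent_at
        if round_trip > self.timeout:
            # The suspicion (if any) was wrong: adapt the timeout upwards.
            self.timeout = round_trip
        self.suspected = False

    def tick(self, now: float) -> Optional[int]:
        """Periodic check; returns the id of a control message if one was sent."""
        if any(now - sent_at > self.timeout for sent_at in self.pending.values()):
            self.suspected = True
        if now - self.last_send >= self.idle_limit:
            # No recent application traffic, so fall back to a control message.
            return self.send(now, kind="control")
        return None


m = Monitor(timeout=1.0, idle_limit=0.5)
app_msg = m.send(0.0)        # application message doubles as a probe
early = m.tick(0.4)          # recent traffic: no control message needed
late = m.tick(1.5)           # ack overdue: suspect, and probe with a control message
suspected_before_ack = m.suspected
m.on_ack(app_msg, 2.0)       # late ack: suspicion withdrawn, timeout raised
```

In this sketch the monitor pays for a control message only when the application has been silent for `idle_limit` time units, which mirrors the cost argument made in the abstract; the specific adaptation rule (raising the timeout to the largest observed round trip) is an assumption, not a claim about the authors' design.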

Similar resources

Self-healing in payment switches with a focus on failure detection using State Machine-based approaches

Composition, change and complexity have attracted everyone’s attention towards Self-Adaptive systems. These systems, inspired by the human body, are capable of adapting to changes in the inner and outer environment. The main objective of this study is to achieve a more convenient availability for e-banking services in the payment switch, using self-healing systems and focusing on the failur...


ADAPTIVE ORDERED WEIGHTED AVERAGING FOR ANOMALY DETECTION IN CLUSTER-BASED MOBILE AD HOC NETWORKS

In this paper, an anomaly detection method in cluster-based mobile ad hoc networks with ad hoc on demand distance vector (AODV) routing protocol is proposed. In the method, the required features for describing the normal behavior of AODV are defined via step by step analysis of AODV and independent of any attack. In order to learn the normal behavior of AODV, a fuzzy averaging method is used fo...


A Rotating Roll-Call-Based Adaptive Failure Detection and Recovery Protocol for Smart Home Environments

Smart homes generally differ from other pervasive environments such as office environments. Homes lack system administrators to fix faulty services on the spot. Nevertheless, services in smart homes can be critical, especially when they involve health and wellness services, since faulty services can lead to unexpected or undesirable consequences. Therefore, robustness and availability are tw...


How Bad Are Wrong Suspicions? Towards Adaptive Distributed Protocols

In this paper, we analyze the performance of consensus protocols based on the rotating coordinator paradigm. We consider a simulated production environment, on which processing and communication resources available for the different processes running the protocols are not necessarily the same. Firstly, we show that, in some scenarios, the performance of the consensus protocol is enhanced when t...




Publication date: 2001